1 Introduction

In many random combinatorial problems, the distribution of the statistic of interest is the law of an
empirical mean built on an independent and identically distributed (i.i.d.) sample conditioned by
some exogenous integer random variable (r.v.). In general, this exogenous r.v. is itself a sample
mean built on integer r.vs. Hence, a general framework for this kind of problem may be formalized
as follows. Let $(q_n)$ be a positive integer sequence. Further, let $\mathbb{X} = (X_j^{(n)})_{n \in \mathbb{N}^*,\, j=1,\dots,nq_n}$ and
$\mathbb{Y} = (Y_j^{(n)})_{n \in \mathbb{N}^*,\, j=1,\dots,nq_n}$ be two triangular arrays of random variables. Both arrays are such
that on their lines the r.vs are i.i.d. Moreover, it is assumed that the elements of the array
$\mathbb{X}$ are integers. The distribution of interest is then the law of $(nq_n)^{-1} T_n := (nq_n)^{-1} \sum_{j=1}^{nq_n} Y_j^{(n)}$
conditioned on a specific value of $S_n := \sum_{j=1}^{nq_n} X_j^{(n)}$. That is the conditional distribution

$$\mathcal{L}_n := \mathcal{L}\big((nq_n)^{-1} T_n \mid S_n = np_n\big),$$

where $(p_n)$ is some given positive integer sequence. When the distribution of $(X_j^{(n)}, Y_j^{(n)})$ does
not depend on $n$, the Gibbs conditioning principle ([22], [5], [6]) states that $\mathcal{L}_n$ converges weakly
to the degenerate distribution concentrated on a point $x$ depending on the conditioning value
(see Corollary 2.6). Around the Gibbs conditioning principle, general limit theorems yielding the
asymptotic behavior of the conditioned sum are given in [21], [13], [17]. Asymptotic expansions
for the distribution of the conditioned sum are proved in [11], [18]. In this paper our aim is
to prove a large deviation principle for $\mathcal{L}_n$. Roughly speaking, this means that we will give
an exponential equivalent for this conditional distribution. On a finer scale, we prove a large
deviation principle for $\widetilde{\mathcal{L}}_n := \mathcal{L}\big(\sqrt{a_n/(nq_n)}\,(T_n - b_n) \mid S_n = np_n\big)$, where $b_n$ is a centering factor specified
in Theorem 2.4 of Section 2.2.3 and $(a_n)$ is a decreasing positive sequence of real numbers with
$a_n \to 0$ and $n a_n q_n \to +\infty$. We then say that $\mathcal{L}_n$ satisfies a moderate deviation principle [6, Section
3.7]. Our work follows the nice ones of Janson [15], [14]. In these papers, a central limit theorem
with moment convergence is proved. The starting point in the proof is a simple representation of
the conditional characteristic function as an inverse Fourier transform. This representation was
first given by Bartlett [3, Equation (16)]. To establish large and moderate deviation principles
we will make use of the Gärtner-Ellis Theorem, in which an asymptotic evaluation of the Laplace
transform is needed. For this purpose, we first transcribe the Bartlett formula to get a simple
integral representation for the conditional Laplace transform (see Lemma 3.1). The main result
of [15] is quite general as it only requires assumptions on the first three moments of $(X, Y)$. Here
we need further assumptions. However, contrary to [15, Section 2], we do not restrict ourselves to the
central case (conditioning on $S_n = \mathbb{E}(S_n)$) nor to the "pseudo" central case (conditioning on
$S_n = \mathbb{E}(S_n) + O(\sqrt{nq_n}\,\sigma_{X^{(n)}})$, with $\sigma_X^2 = \mathrm{Var}(X)$). In [18], the authors study general saddle

point approximations for multidimensional discrete empirical means and obtain an approximation
formula for conditional probabilities. We focus here on the exponential part of this formula, stating
a full large deviation principle (see Theorem 2.3). Using some classical tools of convex analysis
we give an explicit, natural and elegant form for the rate function. Furthermore, we complement
our study by stating a moderate deviation principle for the conditional law (see Theorem 2.4). As
usual, the rate function is quadratic and the scaling factor is the asymptotic variance, which can
be interpreted here as a residual variance in some linear regression model, generalizing the factor
found in [15]. The paper is organized as follows. In the next section, to be self contained, we first
recall some classical results on large deviation principles. Then we state our main results: a large
deviation principle and a moderate deviation principle for conditioned sums. Section 3 is devoted
to the proofs. In Section 4 we apply our main results to some combinatorial examples. We also
discuss possible extensions to more general models.



2 Main results 

2.1 Large and moderate deviations 
2.1.1 Some generalities 

Let us first recall what a large deviation principle (L.D.P.) is (see for example [6], [12]). In the
whole paper, $(a_n)$ is a decreasing positive sequence of real numbers with $\lim_{n \to \infty} a_n = 0$.

Definition 2.1. We say that a sequence $(R_n)$ of probability measures on a measurable Hausdorff
space $(U, \mathcal{B}(U))$ satisfies a LDP with rate function $I$ and speed $(a_n)$ if:

i) $I$ is lower semicontinuous (lsc), with values in $\mathbb{R}^+ \cup \{+\infty\}$.

ii) For any measurable set $A$ of $U$:

$$-I(\mathrm{int} A) \le \liminf_{n \to \infty} a_n \log R_n(A) \le \limsup_{n \to \infty} a_n \log R_n(A) \le -I(\mathrm{clo} A),$$

where $I(A) = \inf_{\xi \in A} I(\xi)$ and $\mathrm{int} A$ (resp. $\mathrm{clo} A$) is the interior (resp. the closure) of $A$.

We say that the rate function $I$ is good if its level set $\{x \in U : I(x) \le a\}$ is compact for any
$a \ge 0$. More generally, a sequence of $U$-valued random variables is said to satisfy a LDP if their
distributions satisfy a LDP.

To be self-contained, we also recall some definitions and results which will be used in the
sequel (we refer to [6], [12] for more on large deviations).

Laplace and Fenchel-Legendre transforms. To begin with, let $Z$ be a nonnegative integer
random variable and define the span of $Z$ by $m_Z := \sup\{m \in \mathbb{N} : \exists b \in \mathbb{N},\ \mathrm{Supp}(Z) \subset m\mathbb{N} + b\}$. Let
$\varphi_Z$ denote the characteristic function of $Z$. When $Z$ is square integrable, $\sigma_Z^2$ denotes its variance.
For $\tau$ lying in $\mathrm{dom}\,\psi_Z := \{\tau \in \mathbb{R} : \mathbb{E}[\exp(\tau Z)] < +\infty\}$, we define $\psi_Z(\tau) := \log \mathbb{E}[\exp(\tau Z)]$ as the
cumulant generating function of $Z$. Obviously, $\mathrm{dom}\,\psi_Z$ contains at least $\mathbb{R}^-$ and $\psi_Z$ is analytic
in the interior of $\mathrm{dom}\,\psi_Z$. We denote by $R_Z$ the interior of the range of $\psi_Z'$. It is well known
that $R_Z$ is a subset of the interior of the convex hull of the support of $Z$. These two subsets of
$\mathbb{R}$ coincide whenever $\psi_Z$ is essentially smooth (see definition below). Further, let $\psi_Z^*$ denote the
Fenchel-Legendre transform of $\psi_Z$ [6, Definition 2.2.2 p. 26]. For any $\tau^* \in R_Z$, there exists a
unique $\tau_{\tau^*} \in \mathrm{dom}\,\psi_Z$ such that $\psi_Z'(\tau_{\tau^*}) = \tau^*$ and we may define $Z^{*,\tau^*}$ as a r.v. on $\mathbb{N}$ having the
following distribution

$$\mathbb{P}(Z^{*,\tau^*} = k) = \exp[k\tau_{\tau^*} - \psi_Z(\tau_{\tau^*})]\,\mathbb{P}(Z = k), \quad (k \in \mathbb{N}). \tag{1}$$

It is well known that $\mathbb{E}(Z^{*,\tau^*}) = \tau^*$. For more details on the relationships between $\psi_Z$, $\psi_Z^*$ and $Z^{*,\tau^*}$
we refer to the book [2].
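As an illustration of the exponential change of measure (1), here is a minimal Python sketch for a Poisson variable. The parameter values `LAM` and `TARGET` are hypothetical and chosen only for the example; the sketch checks numerically that the tilted law has mean $\tau^*$ and that the Fenchel-Legendre transform agrees with the closed form $\tau^* \log(\tau^*/\lambda) + \lambda - \tau^*$ valid for the Poisson law.

```python
import math

LAM = 1.5      # hypothetical Poisson parameter (illustration only)
TARGET = 2.4   # hypothetical target mean tau* in R_Z = (0, +infinity)

def psi(tau, lam=LAM):
    """Cumulant generating function of a Poisson(lam) variable."""
    return lam * (math.exp(tau) - 1.0)

def tilt_parameter(target, lam=LAM):
    """Solve psi'(tau) = lam * exp(tau) = target for tau."""
    return math.log(target / lam)

def tilted_pmf(k, target, lam=LAM):
    """P(Z^{*,target} = k) = exp(k * tau - psi(tau)) * P(Z = k), as in (1)."""
    tau = tilt_parameter(target, lam)
    log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.exp(k * tau - psi(tau, lam) + log_pmf)

def legendre(target, lam=LAM):
    """Fenchel-Legendre transform psi*(target) = target * tau - psi(tau)."""
    tau = tilt_parameter(target, lam)
    return target * tau - psi(tau, lam)

# Truncating the support at 200 is harmless for these parameter values.
ks = range(200)
total_mass = sum(tilted_pmf(k, TARGET) for k in ks)
tilted_mean = sum(k * tilted_pmf(k, TARGET) for k in ks)
closed_form = TARGET * math.log(TARGET / LAM) + LAM - TARGET

print(total_mass, tilted_mean, legendre(TARGET), closed_form)
```

For the Poisson law the tilted variable is again Poisson, with parameter $\lambda e^{\tau}$, which is why the mean can be matched in closed form here; for other laws the tilt parameter would usually have to be found numerically.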

Let now $(Z, W)$ be a random vector of $\mathbb{R}^2$. We naturally extend some of the previous notations
to $(Z, W)$. For example, $\psi_{Z,W}$ is the cumulant generating function built on $(Z, W)$, defined on
$\mathrm{dom}\,\psi_{Z,W} \subset \mathbb{R}^2$, and $\psi^*_{Z,W}$ denotes the Fenchel-Legendre transform of $\psi_{Z,W}$.

Convex functions. Let $f$ be a proper convex function on $\mathbb{R}^d$. That is, $f$ is convex and valued
in $\mathbb{R} \cup \{+\infty\}$. We say that $f$ is essentially smooth whenever it is differentiable on the nonempty
interior of $\mathrm{dom} f$ and it is steep. That is, for any vector $c$ lying on the boundary of $\mathrm{dom} f$,

$$\lim_{x \to c,\ x \in \mathrm{int\,dom} f} \|\nabla f(x)\| = +\infty,$$

where $\nabla f(x)$ denotes the gradient of $f$ at point $x$.
Gärtner-Ellis Corollary.

Corollary 2.2. [Gärtner-Ellis, [6, Theorem 2.3.6 c) p. 44]] Let $(Z_n)$ be a sequence of random
variables valued in $\mathbb{R}$, and $(a_n)$ a decreasing positive sequence of real numbers with $\lim_{n \to \infty} a_n = 0$.
Define $\Lambda_n(\theta) = \ln \mathbb{E}\, e^{\theta Z_n}$. Assume that

1. for all $\theta \in \mathbb{R}$, $a_n \Lambda_n(\theta/a_n) \to \Lambda(\theta) \in\, ]-\infty, +\infty]$,

2. $0$ lies in the interior of $\mathrm{dom}(\Lambda)$ and $\Lambda$ is essentially smooth and lower semicontinuous.
Then $(Z_n)$ satisfies a LDP with good rate function $\Lambda^*$ and speed $a_n$.
2.2 Main results



2.2.1 The model 

For $n \in \mathbb{N}^*$, let $(X^{(n)}, Y^{(n)})$ be a random vector with $X^{(n)} \in \mathbb{N}$. We assume that $m_{X^{(n)}} = 1$ and
that $(X^{(n)}, Y^{(n)})$ converges in law to $(X, Y)$ where $X$ is a non essentially constant nonnegative
integer valued r.v. having also span 1. Note that it implies that $\psi_X$ is strictly convex and that
$X^{(n)}$ is not essentially constant and that $\psi_{X^{(n)}}$ is strictly convex for $n$ large enough. Further let
$\big((X_i^{(n)}, Y_i^{(n)})\big)_{1 \le i \le nq_n}$ be an i.i.d. sample having the same distribution as $(X^{(n)}, Y^{(n)})$.

Let, for $n \in \mathbb{N}^*$ and $q_n \in \mathbb{N}^*$, $S_n = X_1^{(n)} + \cdots + X_{nq_n}^{(n)}$ and $T_n = Y_1^{(n)} + \cdots + Y_{nq_n}^{(n)}$. In the whole
paper $(p_n)$ will be a sequence of positive integers such that $\mathbb{P}(S_n = np_n) > 0$.

2.2.2 Large deviations 

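Theorem 2.3. Let $p, q, p_n, q_n \in \mathbb{N}^*$ such that $p_n/q_n \in R_{X^{(n)}} \to p/q \in R_X$. Assume that

1. the function $\psi_{X,Y}$ is essentially smooth, and let $\tau$ be the unique real such that $\psi_X'(\tau) = p/q$,

2. $\mathrm{dom}\,\psi_Y = \mathrm{dom}\,\psi_{Y^{(n)}} = \mathbb{R}$,

3. there exists $r > 0$ such that $I_\tau := [\tau - r, \tau + r] \subset (\mathrm{dom}\,\psi_X) \cap (\cap_{n \ge 1} \mathrm{dom}\,\psi_{X^{(n)}})$ and

$$\forall u \in \mathbb{R},\ \forall s \in I_\tau, \quad \sup_{t \in \mathbb{R}} \Big| \mathbb{E}\Big[ e^{(it+s)X^{(n)} + uY^{(n)}} - e^{(it+s)X + uY} \Big] \Big| \xrightarrow[n \to \infty]{} 0. \tag{2}$$

Then the distribution of $(T_n/nq_n)$ conditioned by the event $\{S_n = np_n\}$ satisfies a LDP with good
rate function $\psi^*_{X,Y}(p/q, \cdot) - \psi^*_X(p/q)$ and speed $(nq_n)^{-1}$.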

2.2.3 Moderate deviations 

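Let $\xi$ lie in the interior of $\mathrm{dom}\,\psi_X$ and consider the random vector $(\check X_\xi, \check Y_\xi)$ whose distribution
is given, for any $k \in \mathbb{N}$ and real Borel set $A$, by

$$\mathbb{P}(\check X_\xi = k, \check Y_\xi \in A) = \exp[-\psi_X(\xi) + k\xi]\,\mathbb{P}(X = k, Y \in A). \tag{3}$$

We define in the same way the random vector $(\check X^{(n)}_\xi, \check Y^{(n)}_\xi)$. Obviously, $\check X_\xi$ has the same distribution
as $X^{*,\xi^*}$ with $\xi^* = \psi_X'(\xi)$. Further, let $\alpha_\xi^2$ be the variance of the residual for the linear
regression of $\check Y_\xi$ on $\check X_\xi$:

$$\alpha_\xi^2 := \mathrm{Var}(\check Y_\xi) - \frac{\mathrm{Cov}(\check X_\xi, \check Y_\xi)^2}{\mathrm{Var}(\check X_\xi)}. \tag{4}$$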

Then we get the following result. 

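Theorem 2.4. Let $p, q, p_n, q_n \in \mathbb{N}^*$ such that $p/q$ (resp. $p_n/q_n$) lies in $R_X$ (resp. $R_{X^{(n)}}$) and
$p_n/q_n \to p/q$. Assume that

1. there exists $r_0 > 0$ such that

$$B_0 := ]-r_0, r_0[ \subset (\mathrm{dom}\,\psi_Y) \cap (\cap_{n \ge 1} \mathrm{dom}\,\psi_{Y^{(n)}}),$$

and let $\tau$ (resp. $\tau_n$) be the unique real such that $\psi_X'(\tau) = p/q$ (resp. $\psi_{X^{(n)}}'(\tau_n) = p_n/q_n$),

2. there exists $r > 0$ such that $I_\tau := [\tau - r, \tau + r] \subset (\mathrm{dom}\,\psi_X) \cap (\cap_{n \ge 1} \mathrm{dom}\,\psi_{X^{(n)}})$ and

$$\forall s \in I_\tau, \quad \sup_{t \in \mathbb{R}} \Big| \mathbb{E}\Big[ e^{(it+s)X^{(n)}} - e^{(it+s)X} \Big] \Big| \xrightarrow[n \to \infty]{} 0, \tag{5}$$

$$\sup_{n \ge 1}\ \sup_{(s,v) \in I_\tau \times B_0} \mathbb{E}\Big( e^{sX^{(n)} + v(Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n}))} \Big) < \infty, \tag{6}$$

3. $(a_n)$ satisfies $n a_n q_n \to +\infty$.

Then the distribution of $\sqrt{a_n/(nq_n)}\,\big(T_n - nq_n \mathbb{E}(\check Y^{(n)}_{\tau_n})\big)$ conditioned by the event $\{S_n = np_n\}$ satisfies
a LDP with good rate function $J(\cdot) = \frac{(\cdot)^2}{2\alpha_\tau^2}$ and speed $a_n$.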



Remark 2.5. As $(nq_n)^{-1} = o(a_n)$, we say that the distribution of $\sqrt{a_n/(nq_n)}\,\big(T_n - nq_n \mathbb{E}(\check Y^{(n)}_{\tau_n})\big)$
conditioned by the event $\{S_n = np_n\}$ satisfies a moderate deviation principle (MDP).

Corollary 2.6. Under the assumptions of one of the last theorems, $\mathcal{L}_n$ converges in distribution
towards the degenerate distribution concentrated on $\mathbb{E}(\check Y_\tau)$.

3 Proofs 

For $p_n \in \mathbb{N}$ such that $\mathbb{P}(S_n = np_n) \neq 0$, let

$$f_n(u) := \frac{1}{nq_n} \log \mathbb{E}[\exp(uT_n) \mid S_n = np_n] \in \mathbb{R} \cup \{+\infty\}.$$

In order to apply the Gärtner-Ellis Corollary, we have to prove that $f_n$ converges when $n \to \infty$.

The next two subsections yield a simple representation of $f_n(u)$ using the Fourier transform.

3.1 A simple representation using Fourier Transform 

Recall that we set $\varphi_Z(t) := \mathbb{E}(e^{itZ})$. An obvious but useful lemma follows.

Lemma 3.1 (Bartlett's Formula, see Equation (16) in [3]). Let $Z$ be a nonnegative integer r.v.
and $W$ be an integrable r.v. Then, for any nonnegative integer $k$ lying in the support of $Z$,

$$\mathbb{E}[W \mid Z = k] = \frac{\int_{-\pi}^{\pi} \mathbb{E}[W \exp(itZ)] \exp(-ikt)\,dt}{\int_{-\pi}^{\pi} \varphi_Z(t) \exp(-ikt)\,dt}.$$
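Bartlett's formula can be checked numerically on a small example. The sketch below is only an illustration: the joint law `JOINT` of $(Z, W)$ is hypothetical, and the integrals over $[-\pi, \pi]$ are replaced by Riemann sums on a uniform grid, which is exact here up to rounding because the integrands are trigonometric polynomials of degree smaller than the grid size.

```python
import cmath
import math

# Hypothetical joint law of (Z, W): Z takes values in {0,...,3}, W is real.
JOINT = {
    (0, 1.0): 0.20,
    (1, -0.5): 0.10,
    (1, 2.0): 0.30,
    (2, 0.5): 0.25,
    (3, 4.0): 0.15,
}

def direct_conditional_mean(k):
    """E[W | Z = k] computed from the definition."""
    num = sum(p * w for (z, w), p in JOINT.items() if z == k)
    den = sum(p for (z, _), p in JOINT.items() if z == k)
    return num / den

def bartlett_conditional_mean(k, grid=64):
    """E[W | Z = k] as a ratio of two inverse Fourier integrals."""
    num = 0j
    den = 0j
    for j in range(grid):
        t = -math.pi + 2.0 * math.pi * j / grid
        kernel = cmath.exp(-1j * k * t)
        # E[W exp(itZ)] and the characteristic function of Z at the point t.
        num += kernel * sum(p * w * cmath.exp(1j * t * z) for (z, w), p in JOINT.items())
        den += kernel * sum(p * cmath.exp(1j * t * z) for (z, _), p in JOINT.items())
    # The common factor 2*pi/grid of both Riemann sums cancels in the ratio.
    return (num / den).real

for k in range(4):
    print(k, direct_conditional_mean(k), bartlett_conditional_mean(k))
```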
3.2 Laplace lemmas

We begin this section with a variation on a lemma first due to Laplace, see [10]. To be self
contained we also give a sketch of its proof.

Lemma 3.2. Let $p_n, q_n, p, q \in \mathbb{N}^*$. Let $Z$ be a non constant square integrable nonnegative integer
r.v. with span $m_Z = 1$. Let $(Z^{(n)})$ be a sequence of nonnegative i.i.d. integer random variables
also having span 1. Let $Z_1^{(n)}, \dots, Z_{nq_n}^{(n)}$ be an i.i.d. sample distributed as $Z^{(n)}$. Assume that

1. $\|\varphi_{Z^{(n)}} - \varphi_Z\|_\infty \to 0$ as $n \to +\infty$,

2. the means of $Z$ and $Z^{(n)}$ are rational, equal respectively to $p/q$ and $p_n/q_n$, with $p_n/q_n \to p/q$,

3. $\sigma^2_{Z^{(n)}} \to \sigma^2_Z$ as $n \to +\infty$,

4. $\mathbb{E}(|Z^{(n)} - p_n/q_n|^3)$ is uniformly bounded.

Then, when $n$ tends to infinity,

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = \frac{1}{\sqrt{2\pi n q_n}\,\sigma_Z}\,(1 + o(1)). \tag{7}$$



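Proof: The inversion of the Fourier transform yields

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = \int_{-\pi}^{\pi} e^{-inp_n t}\,\varphi_{Z^{(n)}}(t)^{nq_n}\,\frac{dt}{2\pi} = \int_{-\pi}^{\pi} \Big(e^{-i\frac{p_n}{q_n}t}\,\varphi_{Z^{(n)}}(t)\Big)^{nq_n}\,\frac{dt}{2\pi}. \tag{8}$$

On one hand, using a Taylor expansion of order 2 for $e^{-i\frac{p_n}{q_n}t}\varphi_{Z^{(n)}}(t) = \varphi_{Z^{(n)} - p_n/q_n}(t)$, we get

$$\varphi_{Z^{(n)} - p_n/q_n}(t) = 1 - \frac{\sigma_n^2 t^2}{2} - \frac{i t^3}{6}\,\mathbb{E}\Big[(Z^{(n)} - p_n/q_n)^3 e^{it^*(Z^{(n)} - p_n/q_n)}\Big],$$

where $t^*$ lies in $[0, t]$ and $\sigma_n^2$ stands for $\sigma^2_{Z^{(n)}}$. Now as $\mathbb{E}|Z^{(n)} - p_n/q_n|^3$ is bounded and as $\sigma_n^2 \to \sigma_Z^2$,
we can find a positive number $\delta < \pi$ (independent of $n$) such that for $|t| < \delta$ and for $n$ large enough

$$\Big|e^{-i\frac{p_n}{q_n}t}\varphi_{Z^{(n)}}(t)\Big| = \big|\varphi_{Z^{(n)} - p_n/q_n}(t)\big| \le 1 - \frac{\sigma_Z^2 t^2}{4}. \tag{9}$$

On the other hand, as $m_{Z^{(n)}} = m_Z = 1$ one has both $\zeta := \sup_{\delta \le |t| \le \pi} |\varphi_Z(t)| < 1$ and $\zeta_n :=
\sup_{\delta \le |t| \le \pi} |\varphi_{Z^{(n)}}(t)| < 1$. Further, as $\|\varphi_{Z^{(n)}} - \varphi_Z\|_\infty \to 0$, we get $\zeta_n \to \zeta$. Let $\varepsilon > 0$ be such that
$\zeta + \varepsilon < 1$. For $n$ large enough, $\zeta_n < \zeta + \varepsilon$. Now to conclude, one splits the integral in (8) into two
integrals $I_1$, $I_2$, integrating on $|t| < \delta$ and on $|t| \ge \delta$. $|I_2|$ is bounded by $2(\pi - \delta)(\zeta + \varepsilon)^{nq_n}/(2\pi)$, hence
is exponentially small. To deal with $I_1$, one performs the variable change $u = \sqrt{nq_n}\,\sigma_Z t$, and uses
both (9) and the inequality $\log(1 - \theta) \le -\theta$ ($\theta \in [0, 1[$) to conclude by using both the central limit and
Lebesgue Theorems. ■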
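The local limit estimate (7) can be illustrated on the simplest case. The following sketch assumes, purely for illustration, that $Z^{(n)} = Z$ is Poisson with parameter `lam` (so that the sum of $N = nq_n$ copies is Poisson with parameter $N\lambda$ and $\sigma_Z^2 = \lambda$), and compares the exact probability with $1/(\sqrt{2\pi N}\,\sigma_Z)$. The function names are hypothetical.

```python
import math

def exact_log_prob(n_terms, lam=1.0):
    """log P(Z_1 + ... + Z_N = N * lam) for i.i.d. Poisson(lam); N * lam must be an integer."""
    total_rate = n_terms * lam
    target = round(total_rate)
    return -total_rate + target * math.log(total_rate) - math.lgamma(target + 1)

def local_limit_log_prob(n_terms, lam=1.0):
    """log of 1 / (sqrt(2 * pi * N) * sigma_Z), with sigma_Z^2 = lam for the Poisson law."""
    return -0.5 * math.log(2.0 * math.pi * n_terms * lam)

# Ratio exact / approximation; it should approach 1 as N grows.
ratios = {
    n: math.exp(exact_log_prob(n) - local_limit_log_prob(n))
    for n in (10, 100, 1000, 10000)
}
for n, ratio in ratios.items():
    print(n, ratio)
```

Working with logarithms and `math.lgamma` avoids overflow in the factorial for large $N$.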
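Remark 3.3. Note that if $m_Z \neq 1$, the lemma above is in general not valid. Take for example
$m_Z = 2$; then in the interval $[0, 2\pi[$, $|\varphi_{Z^{(n)}}(t)| = |\varphi_Z(t)| = 1$ if and only if $t = 0$ or $t = \pi$. Obviously,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} = \frac{1}{2\pi}\int_{-\pi/2}^{3\pi/2}.$$

Hence

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = \frac{1}{2\pi}\int_{-\pi/2}^{3\pi/2} e^{-inp_n t}\,\varphi_{Z^{(n)}}^{nq_n}(t)\,dt. \tag{10}$$

Set $f_n(t) := e^{-inp_n t}\varphi_{Z^{(n)}}^{nq_n}(t)$. Now, we split the integral in the right hand side of equation (10)
in five parts

$$\frac{1}{2\pi}\int_{-\pi/2}^{3\pi/2} f_n(t)\,dt = \frac{1}{2\pi}\int_{-\pi/2}^{-\delta} f_n(t)\,dt + \frac{1}{2\pi}\int_{-\delta}^{\delta} f_n(t)\,dt + \frac{1}{2\pi}\int_{\delta}^{\pi-\delta} f_n(t)\,dt + \frac{1}{2\pi}\int_{\pi-\delta}^{\pi+\delta} f_n(t)\,dt + \frac{1}{2\pi}\int_{\pi+\delta}^{3\pi/2} f_n(t)\,dt$$

$$= I_1 + I_2 + I_3 + I_4 + I_5.$$

Using the same arguments as above, we can prove that $I_1$, $I_3$ and $I_5$ are exponentially small.
We also get that $I_2 = \frac{1}{2\pi}\int_{-\delta}^{\delta} f_n(t)\,dt = \frac{1}{\sqrt{2\pi nq_n}\,\sigma_Z}(1 + o(1))$. Let us deal now with $I_4 = \frac{1}{2\pi}\int_{\pi-\delta}^{\pi+\delta} f_n(t)\,dt$.

There exist nonnegative integer valued random variables $\tilde Z^{(n)}$ such that $Z^{(n)} = 2\tilde Z^{(n)} + b$; setting $u = t - \pi$,

$$I_4 = \frac{1}{2\pi}(-1)^{np_n + b\,nq_n}\int_{-\delta}^{\delta} e^{-inp_n u}\,\varphi_{Z^{(n)}}^{nq_n}(u)\,du.$$

Hence $I_4 = \frac{(-1)^{np_n + b\,nq_n}}{\sqrt{2\pi nq_n}\,\sigma_Z}(1 + o(1))$.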

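We now give an extension of the previous lemma involving not only the probability for the
sum to be equal to the mean of $Z^{(n)}$ but to any good rational number.

Lemma 3.4. Let $Z$ be a nonnegative and non degenerate integer r.v. with span $m_Z = 1$. Let
$(Z_j^{(n)})_j$ be a sequence of i.i.d. nonnegative integer random variables having also span 1. Let
$p_n, q_n, p, q \in \mathbb{N}^*$ such that $p_n/q_n \in R_{Z^{(n)}} \to p/q \in R_Z$. Let $\tau$ (resp. $\tau_n$) be the unique real such
that $\psi_Z'(\tau) = p/q$ (resp. $\psi_{Z^{(n)}}'(\tau_n) = p_n/q_n$). We make the following assumptions.

1. There exists $r > 0$ such that $I_\tau := [\tau - r, \tau + r] \subset (\cap_{n \ge 1} \mathrm{dom}\,\psi_{Z^{(n)}}) \cap (\mathrm{dom}\,\psi_Z)$.

2. We have

$$\forall s \in I_\tau, \quad \lim_{n \to \infty}\ \sup_{t \in \mathbb{R}} \Big| \mathbb{E}\Big( e^{(it+s)Z^{(n)}} - e^{(it+s)Z} \Big) \Big| = 0. \tag{11}$$

Then, when $n$ goes to infinity,

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = \frac{e^{-nq_n \psi^*_{Z^{(n)}}(p_n/q_n)}}{\sqrt{2\pi n q_n}\,\sigma_{Z^{*,p/q}}}\,(1 + o(1)), \tag{12}$$

where $\sigma^2_{Z^{*,p/q}}$ is the variance of $Z^{*,p/q}$ defined in (1).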
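To see estimate (12) at work away from the mean, the following sketch takes, as an illustrative assumption, a geometric law $\mathbb{P}(Z = k) = (1 - \rho)\rho^k$ (not varying with $n$), for which the sum of $N = nq_n$ copies is negative binomial. With $c = p/q$, the tilted law is geometric with parameter $\rho' = c/(1+c)$, so its variance is $c(1+c)$. The values `RHO`, `N_TERMS` and `C` are hypothetical.

```python
import math

RHO = 0.5  # hypothetical geometric parameter: P(Z = k) = (1 - RHO) * RHO**k

def exact_log_prob(n_terms, total, rho=RHO):
    """log P(Z_1 + ... + Z_N = total), a negative binomial probability."""
    log_binom = (math.lgamma(total + n_terms) - math.lgamma(total + 1)
                 - math.lgamma(n_terms))
    return log_binom + n_terms * math.log(1.0 - rho) + total * math.log(rho)

def legendre(c, rho=RHO):
    """psi_Z^*(c) for the geometric law, via the tilted parameter rho' = c / (1 + c)."""
    rho_tilted = c / (1.0 + c)
    tau = math.log(rho_tilted / rho)                          # solves psi'(tau) = c
    psi_tau = math.log((1.0 - rho) / (1.0 - rho_tilted))      # psi(tau)
    return c * tau - psi_tau

def approx_log_prob(n_terms, c, rho=RHO):
    """log of exp(-N psi^*(c)) / (sqrt(2 pi N) sigma), with sigma^2 = c (1 + c)."""
    sigma2 = c * (1.0 + c)
    return (-n_terms * legendre(c, rho)
            - 0.5 * math.log(2.0 * math.pi * n_terms * sigma2))

N_TERMS, C = 2000, 2.0
gap = exact_log_prob(N_TERMS, int(N_TERMS * C)) - approx_log_prob(N_TERMS, C)
print(gap)  # difference of the logarithms, expected to be close to 0
```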
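Proof: Using the multinomial formula, we may write

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = e^{-np_n\tau_n + nq_n\psi_{Z^{(n)}}(\tau_n)}\,\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n),*,p_n/q_n} = np_n\Big),$$

where $\tau_n$ is the unique real such that $\psi_{Z^{(n)}}'(\tau_n) = p_n/q_n$. Hence

$$\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n)} = np_n\Big) = e^{-nq_n\psi^*_{Z^{(n)}}(p_n/q_n)}\,\mathbb{P}\Big(\sum_{j=1}^{nq_n} Z_j^{(n),*,p_n/q_n} = np_n\Big),$$

where $Z_1^{(n),*,p_n/q_n}, \dots, Z_{nq_n}^{(n),*,p_n/q_n}$ are i.i.d. r.vs. having the distribution defined by (14).
Further, the expectation of this last r.v. is $p_n/q_n$. Let us now check the assumptions of Lemma 3.2.

Assumption 2 of Lemma 3.2 is satisfied by construction of $Z^{(n),*,p_n/q_n}$.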
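Let us prove that $\mathbb{E}\,|Z^{(n),*,p_n/q_n} - p_n/q_n|^3$ is bounded. By definition of the tilted distribution,

$$\mathbb{E}\,\big|Z^{(n),*,p_n/q_n} - p_n/q_n\big|^3 = \sum_{k=0}^{\infty} |k - p_n/q_n|^3\, e^{k\tau_n - \psi_{Z^{(n)}}(\tau_n)}\,\mathbb{P}(Z^{(n)} = k),$$

and the Hölder inequality bounds the right hand side in terms of exponential moments of $Z^{(n)}$.
Using classical arguments on convex functions [19], we get that $\tau_n \to \tau$ as $n \to +\infty$. Hence, by
Assumptions 1. and 2. of Lemma 3.4 we get that $\mathbb{E}\,|Z^{(n),*,p_n/q_n} - p_n/q_n|^3$ is bounded.
Similar arguments yield that

$$\big\|\varphi_{Z^{(n),*,p_n/q_n}} - \varphi_{Z^{*,p/q}}\big\|_\infty \xrightarrow[n \to +\infty]{} 0 \tag{13}$$

and that $\sigma^2_{Z^{(n),*,p_n/q_n}} \to \sigma^2_{Z^{*,p/q}}$.

Hence all the assumptions of Lemma 3.2 are satisfied and we may conclude using Lemma 3.2. ■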



3.3 Some changes of probability 

One of the main tools to prove large deviation results is the use of changes of probability. In this
section, we review the different changes of probability used in this paper.
(a) Let $p/q \in R_Z$. We define $\tau \in \mathrm{dom}\,\psi_Z$ by $\psi_Z'(\tau) = p/q$. We then introduce $Z^{*,p/q}$ as a
random variable valued on $\mathbb{N}$:

$$\mathbb{P}(Z^{*,p/q} = k) = \exp[k\tau - \psi_Z(\tau)]\,\mathbb{P}(Z = k), \quad (k \in \mathbb{N}). \tag{14}$$

We have $\mathbb{E}(Z^{*,p/q}) = p/q$. This change of probability is quite classical in large deviation theory.
In order to prove Lemma 3.4 we also define $Z^{(n),*,p_n/q_n}$, replacing $Z$ by $Z^{(n)}$, $p/q$ by $p_n/q_n$
and $\tau$ by $\tau_n$. Then $\mathbb{E}(Z^{(n),*,p_n/q_n}) = p_n/q_n$, as needed to apply Lemma 3.2.



(b) For $u$ in $\mathrm{dom}\,\psi_Y$, define $X_u$ by

$$\mathbb{P}(X_u = k) = \exp[-\psi_Y(u)]\,\mathbb{E}\big[\exp(uY)\,\mathbb{1}_{\{X = k\}}\big]. \tag{15}$$

Similarly, replacing $(X, Y)$ by $(X^{(n)}, Y^{(n)})$, we define $X_u^{(n)}$. The r.v. $X_u$ and $X_u^{(n)}$ appear
naturally when applying the inversion of the Fourier transform in the proof of Theorem 2.3.



(c) For the moderate deviations, the asymptotic is different (see Theorem 2.4). Therefore the
r.v. $Y^{(n)}$ have to be centered. The centering factor and the rate function are closely related
to the following change of probability. Let $\xi$ lie in the interior of $\mathrm{dom}\,\psi_X$ and consider the
random vector $(\check X_\xi, \check Y_\xi)$ whose distribution is given, for any $k \in \mathbb{N}$ and real Borel set $A$, by

$$\mathbb{P}(\check X_\xi = k, \check Y_\xi \in A) = \exp[-\psi_X(\xi) + k\xi]\,\mathbb{P}(X = k, Y \in A). \tag{16}$$

We define in the same way the random vector $(\check X^{(n)}_\xi, \check Y^{(n)}_\xi)$. Obviously, $\check X_\xi$ has the same
distribution as $X^{*,\xi^*}$ with $\xi^* = \psi_X'(\xi)$. Further, let $\alpha_\xi^2$ be the variance of the residual for
the linear regression of $\check Y_\xi$ on $\check X_\xi$:

$$\alpha_\xi^2 := \mathrm{Var}(\check Y_\xi) - \frac{\mathrm{Cov}(\check X_\xi, \check Y_\xi)^2}{\mathrm{Var}(\check X_\xi)}. \tag{17}$$

Note that $J(y) = \frac{y^2}{2\alpha_\tau^2}$ is the rate function in Theorem 2.4. Moreover, the centering factor is
$\mathbb{E}(\check Y^{(n)}_{\tau_n})$. Hence, the change of probability used in the proof of Theorem 2.3 (see change of
probability (15) above) has to be modified, according to this centering factor. This leads to
the change of probability (18) below.



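(d) Let $\tau$ (resp. $\tau_n$) be such that $\psi_X'(\tau) = p/q$ (resp. $\psi_{X^{(n)}}'(\tau_n) = p_n/q_n$). Define the random
variable $\tilde X_u^{(n)}$ distributed on $\mathbb{N}$ by:

$$\mathbb{P}(\tilde X_u^{(n)} = k) = \exp\big[-\psi_{\tilde Y^{(n)}}(u_n)\big]\,\mathbb{E}\Big[e^{u_n \tilde Y^{(n)}}\,\mathbb{1}_{\{X^{(n)} = k\}}\Big], \tag{18}$$

where $\tilde Y^{(n)} = Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})$ and $u_n = u/\sqrt{n a_n q_n}$.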
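3.4 Proof of Theorem 2.3

Let, for $t \in \mathbb{R}$ and $u \in \mathbb{R}$,

$$\Phi_{X^{(n)},Y^{(n)}}(t, u) := \mathbb{E}\big[\exp(itX^{(n)} + uY^{(n)})\big].$$

On one hand, using Lemma 3.1 we may write, for $u \in \mathbb{R}$ and $n$ large enough,

$$f_n(u) = \frac{1}{nq_n}\log\frac{\int_{-\pi}^{\pi} \Phi_{X^{(n)},Y^{(n)}}(t, u)^{nq_n}\, e^{-inp_n t}\,dt}{\int_{-\pi}^{\pi} \varphi_{X^{(n)}}(t)^{nq_n}\, e^{-inp_n t}\,dt}. \tag{19}$$

Using twice equation (8) we may rewrite (19) as

$$f_n(u) = \frac{1}{nq_n}\Big[\log\mathbb{P}\Big(\sum_{j=1}^{nq_n} X_{u,j}^{(n)} = np_n\Big) - \log\mathbb{P}(S_n = np_n)\Big] + \psi_{Y^{(n)}}(u), \tag{20}$$

where $X_{u,1}^{(n)}, \dots, X_{u,nq_n}^{(n)}$ are independent copies of $X_u^{(n)}$ defined in Subsection 3.3 by equation (15).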



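In order to apply Lemma 3.4 to $X_u^{(n)}$, let us prove that

$$\forall s \in I_\tau, \quad \lim_{n \to \infty}\ \sup_{t \in \mathbb{R}} \Big| \mathbb{E}\Big[ e^{(it+s)X_u^{(n)}} - e^{(it+s)X_u} \Big] \Big| = 0. \tag{21}$$

We have, for all $s \in I_\tau$,

$$\Big| \mathbb{E}\Big[ e^{(it+s)X_u^{(n)}} - e^{(it+s)X_u} \Big] \Big| \le e^{-\psi_{Y^{(n)}}(u)}\,\Big| \mathbb{E}\Big[ e^{(it+s)X^{(n)} + uY^{(n)}} - e^{(it+s)X + uY} \Big] \Big| + \Big| e^{-\psi_{Y^{(n)}}(u)} - e^{-\psi_Y(u)} \Big|\,\mathbb{E}\Big[ e^{sX + uY} \Big].$$

The right hand side of this last inequality tends to 0 by assumption 3. of Theorem 2.3. It remains
to prove that $p_n/q_n$ (resp. $p/q$) belongs to $R_{X_u^{(n)}}$ (resp. $R_{X_u}$). Using the fact that $\psi_{X,Y}$ is
essentially smooth and Assumption 3. of Theorem 2.3, it is easy to see that $R_{X_u} = R_X$ (resp.
$R_{X_u^{(n)}} = R_{X^{(n)}}$, at least for $n$ large enough).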
Applying Lemma 3.4 we obtain, for $u \in \mathbb{R}$,

$$f(u) := \lim_{n \to \infty} \frac{1}{nq_n}\log\mathbb{E}[\exp(uT_n) \mid S_n = np_n] = -\big[\psi^*_{X_u}(p/q) - \psi_Y(u) - \psi^*_X(p/q)\big].$$

The convex dual function $f^*$ of $f$ is given by

$$f^*(y) := \sup_{u \in \mathbb{R}}[uy - f(u)] = \sup_{u \in \mathbb{R}}\big(uy + \psi^*_{X_u}(p/q) - \psi_Y(u)\big) - \psi^*_X(p/q)$$

$$= \sup_{(\xi,u) \in \mathrm{dom}\,\psi_{X,Y}}\Big[uy + \frac{p}{q}\,\xi - \psi_{X,Y}(\xi, u)\Big] - \psi^*_X(p/q) = \psi^*_{X,Y}\Big(\frac{p}{q}, y\Big) - \psi^*_X(p/q). \tag{22}$$

As $\psi_{X,Y}$ is essentially smooth, using Theorem 26.3 in [19], we deduce that $\psi^*_{X,Y}$ is essentially
strictly convex. Hence, using once more Theorem 26.3 in [19], we may deduce that $f$ is essentially
smooth. Therefore we can apply the Gärtner-Ellis Corollary 2.2 (see Theorem 2.3.6 (c) in [6]) and
conclude. ■
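3.5 Proof of Theorem 2.4

Let $\tilde T_n = T_n - nq_n\mathbb{E}(\check Y^{(n)}_{\tau_n})$ and

$$g_n(u) = a_n \log\mathbb{E}\Big(e^{u\tilde T_n/\sqrt{n a_n q_n}} \,\Big|\, S_n = np_n\Big).$$

Proceeding as in the proof of Theorem 2.3, and writing $\Phi_{X,W}(t, u) := \mathbb{E}[\exp(itX + uW)]$, we have

$$g_n(u) = a_n \log\frac{\int_{-\pi}^{\pi} \Phi_{X^{(n)},\,Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})}\big(t, u/\sqrt{n a_n q_n}\big)^{nq_n}\, e^{-inp_n t}\,dt}{\int_{-\pi}^{\pi} \varphi_{X^{(n)}}(t)^{nq_n}\, e^{-inp_n t}\,dt}$$

$$= a_n\Big(\log\mathbb{P}\Big(\sum_{j=1}^{nq_n} \tilde X_{u,j}^{(n)} = np_n\Big) - \log\mathbb{P}(S_n = np_n)\Big) + a_n n q_n\,\psi_{Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})}\big(u/\sqrt{n a_n q_n}\big),$$

where $\tilde X_{u,1}^{(n)}, \dots, \tilde X_{u,nq_n}^{(n)}$ are i.i.d. r.v. on $\mathbb{N}$ with distribution defined in Subsection 3.3 by Equation (18).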



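In order to use Lemma 3.4 we first have to prove that

$$\forall s \in I_\tau, \quad \lim_{n \to \infty}\ \sup_{t \in \mathbb{R}} \Big| \mathbb{E}\Big[ e^{(it+s)\tilde X_u^{(n)}} - e^{(it+s)X} \Big] \Big| = 0. \tag{23}$$

A bound similar to the one used in the proof of Theorem 2.3 shows that this quantity
tends to zero by assumption (5). As in the proof of Theorem 2.3 it is easy to prove that
the ranges of interest coincide with $R_X$ and $R_{X^{(n)}}$ respectively.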

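Using Lemma 3.4 we obtain, for $u \in \mathbb{R}$,

$$g_n(u) \underset{n \to +\infty}{\sim} -q_n n a_n\Big[\psi^*_{\tilde X_u^{(n)}}(p_n/q_n) - \psi_{\tilde Y^{(n)}}(u_n) - \psi^*_{X^{(n)}}(p_n/q_n)\Big],$$

with $\tilde Y^{(n)} = Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})$ and $u_n = u/\sqrt{n a_n q_n}$. Define

$$H_n(h) = \sup_{\xi}\Big[\frac{p_n}{q_n}\,\xi - \psi_{X^{(n)},\tilde Y^{(n)}}(\xi, h)\Big].$$

As

$$\psi^*_{\tilde X_u^{(n)}}(p_n/q_n) - \psi_{\tilde Y^{(n)}}(u_n) = H_n(u_n)$$

and

$$\psi^*_{X^{(n)}}(p_n/q_n) = \sup_x\Big(\frac{p_n}{q_n}\,x - \psi_{X^{(n)}}(x)\Big) = H_n(0),$$

we get

$$g_n(u) \underset{n \to +\infty}{\sim} -q_n n a_n\big(H_n(u/\sqrt{n a_n q_n}) - H_n(0)\big). \tag{24}$$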
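We claim that if $\lim_n h_n = 0$, then

$$\frac{H_n(h_n) - H_n(0)}{h_n^2} = -\frac{\alpha^2_{\tau_n,n}}{2} + o(1), \tag{25}$$

where $\alpha^2_{\tau_n,n}$ denotes the residual variance for the linear regression of $\check Y^{(n)}_{\tau_n}$ on $\check X^{(n)}_{\tau_n}$.
Assuming that (25) is true, and as $\alpha^2_{\tau_n,n} \to \alpha_\tau^2$, we get that

$$\lim_n g_n(u) = \frac{u^2\alpha_\tau^2}{2}.$$

We easily conclude, since

$$g^*(y) = \sup_u\big\{uy - \lim_n g_n(u)\big\} = \frac{y^2}{2\alpha_\tau^2}.$$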
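It remains to prove that (25) is true. Recall that

$$H_n(h) = \sup_{\xi}\Big[\frac{p_n}{q_n}\,\xi - \psi_{X^{(n)},\,Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})}(\xi, h)\Big].$$

In the sequel $\psi'_x$ (resp. $\psi'_y$) will denote the partial derivative of $\psi_{X^{(n)},\,Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})}(\xi, h)$ with respect
to the first (resp. second) variable. On one hand, by assumption (6), we can define on $I_\tau \times B_0$
the function $F_n$ by:

$$F_n(\xi, h) = \psi'_x(\xi, h) - p_n/q_n.$$

We then deduce from the implicit function Theorem that there exists a neighborhood of $(\tau_n, 0)$
on which:

$$H_n(h) = \xi_n(h)\,\frac{p_n}{q_n} - \psi_{X^{(n)},\,Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n})}(\xi_n(h), h),$$

with $F_n(\xi_n(h), h) = 0$.

We can then calculate the derivatives of $H_n$ (in the sequel we omit the argument $(\xi_n(h), h)$ in the
derivatives). We have (with obvious notations)

$$H_n'(h) = -\psi'_y,$$

$$H_n''(h) = \frac{(\psi''_{x,y})^2}{\psi''_{x,x}} - \psi''_{y,y},$$

$$H_n^{(3)}(h) = \Big(\frac{\psi''_{x,y}}{\psi''_{x,x}}\Big)^3\psi^{(3)}_{x,x,x} - 3\Big(\frac{\psi''_{x,y}}{\psi''_{x,x}}\Big)^2\psi^{(3)}_{x,x,y} + 3\,\frac{\psi''_{x,y}}{\psi''_{x,x}}\,\psi^{(3)}_{x,y,y} - \psi^{(3)}_{y,y,y}.$$

Replacing the partial derivatives of $\psi$ by their expressions, we get

$$H_n'(0) = 0 \quad \text{and} \quad H_n''(0) = -\alpha^2_{\tau_n,n},$$

where $\alpha^2_{\tau_n,n}$ is the residual variance for the linear regression of $\check Y^{(n)}_{\tau_n}$ on $\check X^{(n)}_{\tau_n}$.
On the other hand, using a Taylor expansion, we get

$$H_n(h_n) - H_n(0) = h_n H_n'(0) + \frac{h_n^2}{2}H_n''(0) + \frac{h_n^3}{6}H_n^{(3)}(z_n), \quad z_n \in [0, h_n]. \tag{26}$$

Hence (26) becomes

$$H_n(h_n) - H_n(0) = -\frac{\alpha^2_{\tau_n,n}\,h_n^2}{2} + \frac{h_n^3}{6}H_n^{(3)}(z_n), \quad z_n \in [0, h_n]. \tag{27}$$

Now the expression of $H_n^{(3)}$ is a rational fraction of some partial derivatives of
$\mathbb{E}\big(e^{\xi X^{(n)} + h(Y^{(n)} - \mathbb{E}(\check Y^{(n)}_{\tau_n}))}\big)$. The denominator of this rational fraction is bounded away from 0 as
it converges to a variance, and the numerator is bounded by (6). Hence $H_n^{(3)}$ is bounded and the claim
is proved.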



4 Examples 

In this section we give two examples of applications and one counter example. These examples
are borrowed from [15].

4.0.1 Occupancy problem 

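In the classical occupancy problem (see [15] and the references therein for more details), $m$ balls are
distributed at random into $N$ urns. The resulting numbers of balls $Z_1, \dots, Z_N$ have a multinomial
distribution, and it is well known that this equals the distribution of $(X_1, \dots, X_N)$ conditioned on
$\sum_{i=1}^N X_i = m$, where $X_1, \dots, X_N$ are i.i.d. with $X_i \sim \mathcal{P}(\lambda)$, for an arbitrary $\lambda > 0$. The classical
occupancy problem studies the number $W$ of empty urns; this is thus $\sum_{i=1}^N \mathbb{1}_{\{X_i = 0\}}$ conditioned
on $\sum_{i=1}^N X_i = m$.

Now suppose that $m = np_n \to \infty$ and $N = nq_n \to \infty$ with $p_n/q_n \to p/q$. Take $X^{(n)} \sim \mathcal{P}(\lambda_n)$. Note
that we do not assume that $\lambda_n = p_n/q_n$ and $\lambda = p/q$, which is the case in Janson's work. It is
easy to see that Assumption 3. of Theorem 2.3 is fulfilled and that $\psi_{X,Y}$ is essentially smooth.
Moreover, for $(x, y) \in \mathbb{R}^2$ and $(p, q) \in (\mathbb{N}^*)^2$, we have

$$\psi_X(x) = -\lambda + \lambda e^x,$$

$$\psi^*_X(p/q) = \frac{p}{q}\log\Big(\frac{p}{q\lambda}\Big) + \lambda - \frac{p}{q},$$

$$\psi_{X,Y}(x, y) = -\lambda + \log\big(e^{\lambda\exp(x)} - 1 + e^y\big).$$

Hence we can apply Theorem 2.3. Here the function $\psi^*_{X,Y}$ does not have any explicit form. We give
in the Appendix the graph of the rate function for some particular values of $p/q$ and $\lambda$. Assumptions
of Theorem 2.4 are obviously fulfilled. We have

$$\mathbb{P}(\check X_\tau = k) = e^{-\lambda\exp(\tau)}\,\frac{(e^\tau\lambda)^k}{k!}.$$

Hence $\check X_\tau$ is Poisson with parameter $\lambda e^\tau$. An easy calculation gives

$$\mathrm{cov}(\check X_\tau, \check Y_\tau) = -\lambda e^\tau e^{-\lambda\exp(\tau)},$$

$$\mathrm{Var}(\check Y_\tau) = e^{-\lambda\exp(\tau)}\big(1 - e^{-\lambda\exp(\tau)}\big).$$

Hence $\alpha_\tau^2 = e^{-\lambda\exp(\tau)}\big(1 - e^{-\lambda\exp(\tau)} - \lambda e^\tau e^{-\lambda\exp(\tau)}\big)$. Now, as $\tau = \log\big(\frac{p}{q\lambda}\big)$, we get
$J(y) = \frac{y^2}{2e^{-p/q}(1 - (1 + p/q)e^{-p/q})}$, which reduces to $\frac{y^2}{2e^{-\lambda}(1 - (1 + \lambda)e^{-\lambda})}$
in the particular case where $\lambda = p/q$. Note that a functional L.D.P. is given in [4].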
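The residual variance of the occupancy example can be checked numerically. The sketch below assumes the setting of this subsection, namely that the tilted variable is Poisson with parameter $\mu = \lambda e^{\tau}$ and that $Y = \mathbb{1}_{\{X = 0\}}$; it computes the variance of the residual of the linear regression of $Y$ on $X$ directly from the moments and compares it with the closed form $e^{-\mu}(1 - e^{-\mu} - \mu e^{-\mu})$. The function names and the truncation level `kmax` are choices made for this illustration.

```python
import math

def tilted_moments(mu, kmax=400):
    """Second moments of (X, Y) with X ~ Poisson(mu) and Y = 1{X = 0}."""
    pmf = [math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1)) for k in range(kmax)]
    mean_x = sum(k * p for k, p in enumerate(pmf))
    var_x = sum((k - mean_x) ** 2 * p for k, p in enumerate(pmf))
    mean_y = pmf[0]                   # P(X = 0)
    var_y = mean_y * (1.0 - mean_y)   # Y is a Bernoulli variable
    cov_xy = -mean_x * mean_y         # E[X * Y] = 0 since Y = 1 forces X = 0
    return var_x, var_y, cov_xy

def residual_variance(mu):
    """Variance of the residual of the linear regression of Y on X."""
    var_x, var_y, cov_xy = tilted_moments(mu)
    return var_y - cov_xy ** 2 / var_x

def closed_form(mu):
    return math.exp(-mu) * (1.0 - math.exp(-mu) - mu * math.exp(-mu))

for mu in (0.4, 1.0, 3.0):
    print(mu, residual_variance(mu), closed_form(mu))
```

With $\mu = p/q$ this is the quantity that normalizes the quadratic rate function of the moderate deviation principle for the number of empty urns.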
Remark 4.1. Theorem 2.3 allows us to deal with other statistics than $\sum_{j=1}^n \mathbb{1}_{\{X_j = 0\}}$. For example,
statistics of the form $\sum_{j=1}^n f(X_j, Z_j)$ where $Z_1, \dots, Z_n$ are i.i.d. and independent from $X_1, \dots, X_n$.
Let us describe the particular case of the bootstrap, see [7]. Let $Z_1, \dots, Z_n$ be i.i.d. real valued
random variables, independent of $X_1, \dots, X_n$. We choose at random with replacement a sample
$Z_1^*, \dots, Z_n^*$. Then $\sum_{i=1}^n f(Z_i^*)$ is distributed as $\sum_{i=1}^n X_i f(Z_i)$ conditioned on $\sum_{i=1}^n X_i = n$, where
$X_i \sim \mathcal{P}(\lambda)$, $i = 1, \dots, n$, for any $\lambda > 0$. Hence we get the same kind of conditioning as for the
occupancy problem.

4.0.2 Branching processes 

Consider a Galton-Watson process, beginning with one individual, where the number of children
of an individual is given by a random variable $X$ having finite moments. Assume further that
$\mathbb{E}(X) = 1$. We number the individuals as they appear. Let $X_i$ be the number of children of the
$i$-th individual. It is well known (see Example 3.4 in [15] and the references therein) that the
total progeny is $n \ge 1$ if and only if

$$S_k := \sum_{i=1}^{k} X_i \ge k \ \text{ for } 0 \le k < n \ \text{ but } \ S_n = n - 1. \tag{28}$$

This type of conditioning is different from the one studied in the present paper, but Janson proves
[15, Example 3.4] that if we ignore the order of $X_1, \dots, X_n$, conditioning on (28) is equivalent to
conditioning on $S_n = n - 1$. Hence we can study variables of the kind $Y_i = f(X_i)$. Considering
the case where $Y_i = \mathbb{1}_{\{X_i = 3\}}$, the sum $\sum_{i=1}^n Y_i$ is the number of families with three children. Now
choosing $X_i \sim \mathcal{P}(\lambda)$, we compute the rate function as in Example 4.0.1.

4.0.3 Hashing 
The model 

Hashing with linear probing can be regarded as throwing n balls sequentially into m urns at 
random; the urns are arranged in a circle and a ball that lands in an occupied urn is moved to 
the next empty urn, always moving in a fixed direction. The length of the move is called the 
displacement of the ball, and we are interested in the sum of all displacements, which is a random
variable denoted $d_{m,n}$. We assume $n < m$.

After throwing all balls, there are N = m — n empty urns. These divide the occupied urns into 
blocks of consecutive urns. For convenience, we consider the empty urn following a block as 
belonging to this block. Janson [T3] proved that the length of the blocks (counting the empty 
urn) and the sum of displacements inside each block are distributed as $(X_1, Y_1), \dots, (X_N, Y_N)$
($N = m - n$) conditioned on $\sum_{i=1}^N X_i = m$, where $(X_i, Y_i)$ are i.i.d. copies of a pair $(X, Y)$ of
random variables. $X$ has the Borel distribution

$$\mathbb{P}(X = l) = \frac{1}{T(\lambda)}\,\frac{l^{l-1}}{l!}\,\lambda^l, \quad l \in \mathbb{N}^*, \ \lambda \in [0, e^{-1}], \tag{29}$$

where $T(\lambda) = \sum_{l=1}^{\infty} \frac{l^{l-1}}{l!}\lambda^l$ is the well-known tree function and $\lambda$ is an arbitrary number with
$0 < \lambda \le e^{-1}$. The conditional distribution of $Y$ given $X = l$ is the same as the distribution of
$d_{l,l-1}$.

Using Janson's results [51 114[ [TC]. unfortunately we can prove that the joint Laplace transform of 
$(X_i, Y_i)$ is defined only on $(-\infty, a) \times (-\infty, 0)$ for some positive $a$. Hence our results cannot be
applied. Nevertheless, in a forthcoming work, we will study conditioned L.D.P. for self-normalized
sums in the spirit of [20]. In that case the Laplace transform will be defined.



4.0.4 Bose-Einstein statistics 



This example is borrowed from [13]. Consider $N$ urns. Put $n$ indistinguishable balls in the urns
in such a way that each distinguishable outcome has the same probability, i.e.,

$$1 \Big/ \binom{n + N - 1}{n},$$

see for example [8]. Let $Z_k$ be the number of balls in the $k$th urn. It is well known that $(Z_1, \dots, Z_N)$
is distributed as $(X_1, \dots, X_N)$ conditioned on $\sum_{i=1}^N X_i = n$, where $X_1, \dots, X_N$ are i.i.d. with
a geometric distribution. As for Example 4.0.1, we can get a L.D.P. for variables of the form
$\sum_{i=1}^N h(X_i)$ if $\mathrm{dom}\,\psi_{h(X_i)} = \mathbb{R}$.



4.0.5 Possible extensions 

Among possible extensions, let us mention the case where the variables $Y_i$ are independent but do
not have the same distribution. This case occurs in [13, Examples 2 and 3], where the quantity of
interest is the law of $\sum_{i=1}^N h_i(X_i)$ conditioned on the event $\sum_{i=1}^N X_i = n$. Another way to extend
our work is to deal with the case where the variables $X_i$ are independent but not i.i.d. This case
occurs when counting from a random permutation the number of cycles of a fixed size, see for
example [H Chapter 1] 

In the present paper we assume that Pn,qn,P and q are positive. The other cases will be 
considered in a forthcoming work. 
5 Appendix



Here we give the shape of the large deviation rate function in the example of the occupancy
problem when $\lambda = 1$ and $p/q = 1$, $p/q = 0.4$ or $p/q = 3$.

[Figure: rate function for $\lambda = 1$, in three panels: $p/q = 1$, $p/q = 3$ and $p/q = 0.4$; horizontal axis from 0.1 to 1.]
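Since the rate function of the occupancy example has no explicit form, curves of this kind have to be computed numerically. The following is a minimal sketch of one way to do so, not necessarily the method used for the original figure: it evaluates $\psi^*_{X,Y}(p/q, y) - \psi^*_X(p/q)$ for $X \sim \mathcal{P}(\lambda)$ and $Y = \mathbb{1}_{\{X = 0\}}$ by two nested golden-section searches, which is legitimate because the function being maximized is jointly concave. The search boxes `xbox` and `ubox` are assumptions that are adequate for $\lambda = 1$ and values of $y$ not too close to the boundary of the admissible range.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0

def max_golden(func, lo, hi, tol=1e-10):
    """Maximum value of a unimodal function on [lo, hi] by golden-section search."""
    a, b = lo, hi
    c = b - GOLDEN * (b - a)
    d = a + GOLDEN * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:            # the maximizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + GOLDEN * (b - a)
            fd = func(d)
        else:                  # the maximizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - GOLDEN * (b - a)
            fc = func(c)
    return func(0.5 * (a + b))

def psi_xy(x, u, lam):
    """log E exp(x X + u Y) for X ~ Poisson(lam) and Y = 1{X = 0}."""
    return -lam + math.log(math.expm1(lam * math.exp(x)) + math.exp(u))

def psi_x_star(c, lam):
    """Fenchel-Legendre transform of the Poisson cumulant generating function."""
    return c * math.log(c / lam) + lam - c

def rate(y, c, lam=1.0, xbox=(-10.0, 4.0), ubox=(-15.0, 15.0)):
    """Rate function y -> psi*_{X,Y}(c, y) - psi*_X(c), with c = p/q."""
    def inner(u):
        return max_golden(lambda x: c * x + u * y - psi_xy(x, u, lam), *xbox)
    return max_golden(inner, *ubox) - psi_x_star(c, lam)

# The rate vanishes at the limiting proportion of empty urns, exp(-p/q).
print(rate(math.exp(-1.0), 1.0), rate(0.2, 1.0), rate(0.5, 1.0))
```

Evaluating `rate` on a grid of values of `y` gives the shape of the curve for a given `c`; near the ends of the admissible range the maximizers escape the search boxes, so the boxes would have to be enlarged with care there.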